
Add complete implementation of the classical PCA algorithm with covariance matrix and power iteration #10315

Open · wants to merge 3 commits into base: master
Conversation

@nPr0nn commented Nov 15, 2024

Add complete implementation of the classical PCA algorithm with covariance matrix and power iteration with a very simple test file

Description

The current PCA implementation in the repository is incomplete, as outlined in issue #8724. To address this, I implemented a simplified version that computes the covariance matrix and applies the power iteration method to obtain the principal component. This approach produces results consistent with the scipy PCA implementation on the same input matrices.

This is my first time working with ggml, so I would greatly appreciate a review of my implementation to ensure it aligns with best practices for the library.

Additionally, I noticed that @jukofyork used the cross-covariance method in their Python implementation of control-vectors. Inspired by this, I included a utility function to compute the cross-covariance matrix, which could be helpful for anyone exploring that method in this context.

For now, I’ve implemented only the power iteration method since, from my understanding, the focus in most cases is on obtaining the eigenvector corresponding to the largest variance. But it would be nice to implement QR decomposition later on to obtain all eigenvectors.

Tests

A basic test file is included in the cvectors folder and the Makefile as a sanity check. It might be worth considering moving these tests to the appropriate tests folder for consistency with the rest of the project.

Commit: Add complete implementation of the classical PCA algorithm with covariance matrix and power iteration with a very simple test file
ggml_free(ctx);
}

static void compute_cross_covariance(struct pca_params &pca_params,
Collaborator:

Is this normal that this function is unused?

Author:

Yes, I included this function based on @jukofyork's comments about cross-covariance in the issue discussion. While it’s not currently used, it could support future improvements to the algorithm implementation.

@ngxson (Collaborator) left a comment:

LGTM overall. I don't have time to test it yet, but will do another review soon.

Resolved review threads: examples/cvector-generator/vanilla_pca.hpp (outdated), examples/cvector-generator/cvector-generator.cpp (outdated), Makefile
Commit: …e correct ctx_size and add GGML_ASSERT to check v_output
Resolved review threads: examples/cvector-generator/mini-tests/test-vanilla-pca.cpp (outdated), examples/cvector-generator/pca.hpp (outdated)
@ngxson (Collaborator) left a comment:

Looks good. I'll give it a try with cvector-generator.cpp tomorrow and will merge after that.

ggml_status graph_status = ggml_backend_graph_compute(backend, gf);

// Get the graph results (eigenvector and eigenvalue) and store them in b and eigenvalue
if(graph_status == GGML_STATUS_SUCCESS){
Collaborator:

Suggested change:
- if(graph_status == GGML_STATUS_SUCCESS){
+ if (graph_status == GGML_STATUS_SUCCESS) {

Some minor style fix.

eigenvalue = (float)((float*) eigenvalue_tensor->data)[0];

// Check if the similarity is close enough to 1, if so we converged and should break
if(1 - similarity < pca_params.tolerance)
Collaborator:

Same here

@ngxson (Collaborator) commented Nov 20, 2024

I'm having "tensor buffer not set" while running cvector-generator:

  * frame #0: 0x0000000190862600 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x000000019089af70 libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x00000001907a7908 libsystem_c.dylib`abort + 128
    frame #3: 0x00000001000211f8 llama-cvector-generator`ggml_abort(file="ggml/src/ggml-backend.cpp", line=261, fmt=<unavailable>) at ggml.c:169:5 [opt]
    frame #4: 0x0000000100038104 llama-cvector-generator`ggml_backend_tensor_set(tensor=<unavailable>, data=<unavailable>, offset=<unavailable>, size=<unavailable>) at ggml-backend.cpp:261:5 [opt]
    frame #5: 0x00000001001d5d44 llama-cvector-generator`main at pca.hpp:321:9 [opt]
    frame #6: 0x00000001001d572c llama-cvector-generator`main(argc=<unavailable>, argv=<unavailable>) at cvector-generator.cpp:492:9 [opt]
    frame #7: 0x0000000190518274 dyld`start + 2840

The command that I used:

make llama-cvector-generator -j
./llama-cvector-generator -m ../models/meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf -ngl 99

@slaren (Collaborator) commented Nov 20, 2024

This is likely caused by all the direct assignments of tensor->data with malloc. This is not compatible with ggml-backend; the tensors need to be allocated into a buffer with ggml_backend_alloc_ctx_tensors or similar.
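
As a rough, pseudocode-style sketch of the pattern described here (it assumes the ggml headers and an already-initialized backend, and is not compilable on its own; exact setup may differ):

```
// sketch: allocate tensor data through ggml-backend instead of writing tensor->data
struct ggml_init_params params = {
    /*.mem_size   =*/ ggml_tensor_overhead() * 8,
    /*.mem_buffer =*/ NULL,
    /*.no_alloc   =*/ true,   // context holds metadata only; data lives in a backend buffer
};
struct ggml_context * ctx = ggml_init(params);
struct ggml_tensor  * t   = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, n_embd);

// allocate all tensors of this context into one backend buffer
ggml_backend_buffer_t buf = ggml_backend_alloc_ctx_tensors(ctx, backend);

// copy data in through the backend API rather than assigning t->data
ggml_backend_tensor_set(t, src_data, 0, ggml_nbytes(t));
```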

@slaren (Collaborator) commented Nov 20, 2024

While trying to reproduce this I found other problems:

  • covariance in run_single_pca is not big enough and causes a buffer overflow
  • The tensors saved in the gguf do not have names, so it fails when they are added to the gguf file

The last one is especially puzzling to me... was this tested at all?
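
On the naming point: gguf stores tensors by name, so each output tensor needs a name before being written. A minimal sketch (the name string and context variables are illustrative only):

```
ggml_set_name(t, "direction.0");   // name must be set before adding to gguf
gguf_add_tensor(gguf_ctx, t);      // per the comment above, this fails for unnamed tensors
```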

@nPr0nn (Author) commented Nov 20, 2024

@ngxson Oof, I'll have to investigate what is going wrong, probably during the weekend. This didn't happen when computing PCA on float matrices during my tests.

@slaren I don't think there is any direct assignment to tensor->data with malloc in the pca.hpp file anymore; I've changed it to use ggml_backend_tensor_set and ggml_backend_tensor_get. Maybe it's something related to this? Btw, I don't understand why the covariance matrix would not be big enough.

@slaren (Collaborator) commented Nov 20, 2024

The mallocs are in cvector-generator.cpp. This is a possible way to solve that:

diff --git a/examples/cvector-generator/cvector-generator.cpp b/examples/cvector-generator/cvector-generator.cpp
index e7c924fb..17881c3a 100644
--- a/examples/cvector-generator/cvector-generator.cpp
+++ b/examples/cvector-generator/cvector-generator.cpp
@@ -2,6 +2,8 @@
 #include "common.h"
 #include "llama.h"
 #include "ggml.h"
+#include "ggml-cpp.h"
+#include "ggml-alloc.h"

 #include "mean.hpp"
 #include "pca.hpp"
@@ -193,6 +195,7 @@ struct train_context {
     // to easily re-alloc when concat v_diff, we temporary store v_diff in a vector instead of a tensor
     // v_diff_tmp will get converted unto v_diff later on
     std::vector<std::vector<uint8_t>> v_diff_tmp;
+    ggml_backend_buffer_ptr buffer;

     train_context(int n_embd_, int n_layers_) {
         n_embd = n_embd_;
@@ -207,9 +210,9 @@ struct train_context {
             std::vector<uint8_t> empty;
             v_diff_tmp.push_back(empty);
             auto t = ggml_new_tensor_1d(ctx_ggml, GGML_TYPE_F32, n_embd);
-            t->data = malloc(ggml_nbytes(t)); // TODO: get rid of malloc if possible
             v_final.push_back(t);
         }
+        buffer.reset(ggml_backend_alloc_ctx_tensors_from_buft(ctx_ggml, ggml_backend_cpu_buffer_type()));
     }

     // add new rows into existing tensor in v_diff_tmp

@slaren (Collaborator) commented Nov 20, 2024

> Btw I don't understand why the covariance matrix would not be big enough

I don't know what the size is supposed to be, but if you build with LLAMA_SANITIZE_ADDRESS=1 you should see the error. It may depend on the model; I used the very small stories260k.gguf from https://huggingface.co/ggml-org/models/tree/main/tinyllamas

@nPr0nn (Author) commented Nov 20, 2024

Oooh, the mallocs are in cvector-generator.cpp; seems like this file will need to be refactored then.

Thanks! I'll test everything on the weekend, try to reproduce the problem with these models, and investigate why the covariance matrix is overflowing to solve it as well.

@ngxson (Collaborator) commented Nov 21, 2024

Please note that some mallocs were used in cvector-generator.cpp because some tensor sizes are unknown from the beginning (e.g. some depend on the number of input tokens).

There are other places where I simply use malloc to temporarily store tensor data before writing it to GGUF.

Refactoring those would be nice.

3 participants